NSF PAR Search | NSF Public Access Repository


RayZer: A Self-supervised Large View Synthesis Model

Jiang, Hanwen; Tan, Hao; Wang, Peng; Jin, Haian; Zhao, Yue; Bi, Sai; Zhang, Kai; Luan, Fujun; Sunkavalli, Kalyan; Huang, Qixing; et al (October 2025, IEEE/CVF International Conference on Computer Vision)

Free, publicly-accessible full text available October 15, 2026
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

Jin, Haian; Jiang, Hanwen; Tan, Hao; Zhang, Kai; Bi, Sai; Zhang, Tianyuan; Luan, Fujun; Snavely, Noah; Xu, Zexiang (April 2025, International Conference on Learning Representations (ICLR))

We propose the Large View Synthesis Model (LVSM), a novel transformer-based approach for scalable and generalizable novel view synthesis from sparse-view inputs. We introduce two architectures: (1) an encoder-decoder LVSM, which encodes input image tokens into a fixed number of 1D latent tokens, functioning as a fully learned scene representation, and decodes novel-view images from them; and (2) a decoder-only LVSM, which directly maps input images to novel-view outputs, completely eliminating intermediate scene representations. Both models bypass the 3D inductive biases used in previous methods—from 3D representations (e.g., NeRF, 3DGS) to network designs (e.g., epipolar projections, plane sweeps)—addressing novel view synthesis with a fully data-driven approach. While the encoder-decoder model offers faster inference due to its independent latent representation, the decoder-only LVSM achieves superior quality, scalability, and zero-shot generalization, outperforming previous state-of-the-art methods by 1.5 to 3.5 dB PSNR. Comprehensive evaluations across multiple datasets demonstrate that both LVSM variants achieve state-of-the-art novel view synthesis quality. Notably, our models surpass all previous methods even with reduced computational resources (1-2 GPUs).
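To put the reported 1.5 to 3.5 dB PSNR improvement in context, the following is a minimal sketch of how PSNR is commonly computed and of what a given dB gain means in terms of mean squared error. The function names `psnr` and `mse_ratio_for_gain` are illustrative, and the assumption that pixel values lie in [0, 1] is ours; the listing does not state the authors' evaluation code.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumes both sequences have the same length and that pixel values
    lie in [0, max_value] (an assumption made for this sketch).
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of `gain_db` dB."""
    return 10.0 ** (gain_db / 10.0)
```

Under this definition, a gain of 1.5 dB corresponds to a mean squared error that is roughly 1.4 times smaller, and a gain of 3.5 dB to one that is roughly 2.2 times smaller.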
Free, publicly-accessible full text available April 24, 2026
MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Jiang, Hanwen; Xu, Zexiang; Xie, Desai; Chen, Ziwen; Jin, Haian; Luan, Fujun; Shu, Zhixin; Zhang, Kai; Bi, Sai; Sun, Xin; et al (June 2025, IEEE/CVF International Conference on Computer Vision)

Free, publicly-accessible full text available June 1, 2026
Neural Gaffer: Relighting Any Object via Diffusion

Jin, Haian; Li, Yuan; Luan, Fujun; Xiangli, Yuanbo; Bi, Sai; Zhang, Kai; Xu, Zexiang; Sun, Jin; Snavely, Noah (December 2024, Conference on Neural Information Processing Systems (NeurIPS))

Full Text Available
Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

Wu, Liwen; Bi, Sai; Xu, Zexiang; Luan, Fujun; Zhang, Kai; Georgiev, Ilyan; Sunkavalli, Kalyan; Ramamoorthi, Ravi (June 2024, CVPR 24)

Full Text Available
Differentiable Rendering of Neural SDFs through Reparameterization

https://doi.org/10.1145/3550469.3555397

Bangaru, Sai Praveen; Gharbi, Michael; Luan, Fujun; Li, Tzu-Mao; Sunkavalli, Kalyan; Hasan, Milos; Bi, Sai; Xu, Zexiang; Bernstein, Gilbert; Durand, Fredo (November 2022, ACM Transactions on Graphics)
Neural Free‐Viewpoint Relighting for Glossy Indirect Illumination

https://doi.org/10.1111/cgf.14885

Raghavan, Nithin; Xiao, Yan; Lin, Kai‐En; Sun, Tiancheng; Bi, Sai; Xu, Zexiang; Li, Tzu‐Mao; Ramamoorthi, Ravi (July 2023, Computer Graphics Forum)

Precomputed Radiance Transfer (PRT) remains an attractive solution for real‐time rendering of complex light transport effects such as glossy global illumination. After precomputation, we can relight the scene with new environment maps while changing viewpoint in real‐time. However, practical PRT methods are usually limited to low‐frequency spherical harmonic lighting. All‐frequency techniques using wavelets are promising but have so far had little practical impact. The curse of dimensionality and much higher data requirements have typically limited them to relighting with fixed view or only direct lighting with triple product integrals. In this paper, we demonstrate a hybrid neural‐wavelet PRT solution to high‐frequency indirect illumination, including glossy reflection, for relighting with changing view. Specifically, we seek to represent the light transport function in the Haar wavelet basis. For global illumination, we learn the wavelet transport using a small multi‐layer perceptron (MLP) applied to a feature field as a function of spatial location and wavelet index, with reflected direction and material parameters being other MLP inputs. We optimize/learn the feature field (compactly represented by a tensor decomposition) and MLP parameters from multiple images of the scene under different lighting and viewing conditions. We demonstrate real‐time (512 × 512 at 24 FPS, 800 × 600 at 13 FPS) precomputed rendering of challenging scenes involving view‐dependent reflections and even caustics.
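As background for the wavelet representation mentioned in the abstract, the sketch below illustrates the general idea behind wavelet-based relighting: because the orthonormal Haar transform preserves inner products, the outgoing radiance at a point can be computed as a dot product of transport and lighting coefficients in the wavelet domain, and small lighting coefficients can be dropped for speed. This is a simplified 1D illustration with explicit coefficient vectors, not the paper's method, which predicts transport coefficients with an MLP over a feature field. The names `haar_1d` and `relight` and the `keep` parameter are hypothetical.

```python
import math


def haar_1d(signal):
    """Orthonormal 1D Haar transform; the length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    coeffs = [float(v) for v in signal]
    root2 = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        # Scaled pairwise averages (coarse part) and differences (detail part).
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / root2 for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / root2 for i in range(half)]
        coeffs[:length] = avg + diff
        length = half
    return coeffs


def relight(transport_coeffs, light_coeffs, keep=None):
    """Radiance as a dot product of wavelet coefficients.

    If `keep` is given, only the `keep` largest-magnitude lighting
    coefficients are used, which gives an approximate result.
    """
    indices = range(len(light_coeffs))
    if keep is not None:
        indices = sorted(indices, key=lambda i: abs(light_coeffs[i]), reverse=True)[:keep]
    return sum(transport_coeffs[i] * light_coeffs[i] for i in indices)
```

With all coefficients kept, `relight(haar_1d(t), haar_1d(l))` matches the direct sum of `t[i] * l[i]` up to floating-point error; passing a smaller `keep` trades accuracy for fewer multiplications.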
OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets

https://doi.org/10.1109/CVPR46437.2021.00711

Li, Zhengqin; Yu, Ting-Wei; Sang, Shen; Wang, Sarah; Song, Meng; Liu, Yuhan; Yeh, Yu-Ying; Zhu, Rui; Gundavarapu, Nitesh; Shi, Jia; et al (June 2021, IEEE/CVF Conference on Computer Vision and Pattern Recognition)

Full Text Available
Deep view synthesis from sparse photometric images

https://doi.org/10.1145/3306346.3323007

Xu, Zexiang; Bi, Sai; Sunkavalli, Kalyan; Hadap, Sunil; Su, Hao; Ramamoorthi, Ravi (July 2019, ACM Transactions on Graphics)

Full Text Available
